RED.LM6[LSP,JRA]2 - www.SailDart.org

perm filename RED.LM6[LSP,JRA]2 blob sn#216933 filedate 1976-05-22 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00005 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	The properties of terms and their approximate normal forms do not
C00009 00003	.group
C00019 00004	
C00024 00005	.ss(Another look at I and J)
C00032 ENDMK
C⊗;
The properties of terms and their approximate normal forms do not
as yet exhibit a direct connection between the structure of Scott's lattice
models and the reality of ?λ-calculus implementations, though they do provide
the motivation for several (semantical) properties of the models.  For instance
Theorems 16 and 17 capture the intuitive ideas (properly stated by theorem
18  and its corollaries) that: a)#the  behavior of terms is determined
by their approximate normal forms, and b)#unsolvable terms behave as totally
undefined functions.

We shall now attempt to establish a direct connection  with the means of computation
by showing that the equality $= (and the partial order π≤) can be characterized
syntactically as a relation definable from the conversion rules⊗↓Since
the conversion rules are incomplete for these models, we know that $c and $=
are not the same relation, but still a "reasonable" equivalence relation
between terms should be characterizable via the conversion rules.←.

The following are several equivalence relations between terms arising from our discussion
to date:

.begin indent 5,7;nofill;
%21) syntactic identity: ?M ≡ ?N
?M ≡ ?N iff ?M and ?N are identical terms

%22) interconvertibility: ?M $c ?N
?M $c ?N iff ?λ?β?h$ε?M = ?N

%23) equality in $D: ?M $= ?N
?M $= ?N iff $V?M|(πr) = $V?N|(πr) for all πr ε $N

%24) approximate normal form equivalence: ?M $a ?N
?M $a ?N iff ?M and ?N have the same set of (⊗O-reduced) approximate normal forms.

%25) normal form equivalence: ?M $n ?N
?M $n ?N iff (∀?C[ ],?C[?M] has a normal form iff ?C[?N] has the same normal form)

%26) head normal form equivalence: ?M $h ?N
?M $h ?N iff (∀?C[ ],?C[?M] has a h.n.f. iff ?C[?N] has a similar h.n.f.)

%27) solvably equivalent: ?M $s ?N
.fill
Here it is convenient also to define the partial order relation ?M#$≤#?N of
?M being %3solvably extended by%1 ?N:
.begin center;
?M $≤ ?N iff (∀?C[#], ?C[?M] solvable => ?C[?N] solvable)
.end
.indent 7,7;
Then we say that two terms are %3solvably equivalent%1 if each is solvably
extended by the other.
.end
The known implications among these equivalence relations are summarized in Figure 1.

.end "at|]"
.begin tabit3(10,20,25);group;
\\?M ≡ ?N
\
\
\\?M $c ?N
\
\\\    take ∪M ≡ Y%80%*, N ≡ Y%81∩
\\?M $a ?N
\
\\   \   probably holds
\\?M $n ?N
\\     
\\   \   take ?M ≡ ?I, ?N ≡ ?J
\\?M $= ?N
\\   \
\\  \   to be shown
\\?M $h ?N <=> ?M $s ?N

\                                  %2Figure 1.%1
.end

In particular, we have that when terms have the same set of approxiate
normal forms, not only will they yield (the same) normal forms
in the same contexts, i.e., 
.begin center;

?M $a ?N => ?M $n ?N from theorem 18a)
.end
but they will have the same values in Scott's models; i.e.,
.begin center;

?M $a ?N => ?M $= ?N from theorem 16.
.end

More generally, for arbitrary terms ?M and ?N it can be shown that:
.begin center;

(*)   ?M $n ?N => ?M $= ?N
.end

But the reverse implication does not hold. For example; taking
?M#≡#?I, ?N#≡#?J#≡#?Y%4λ∪(λf.λx.λy.x(fy))∩, and ?C[#] as the
null context,[#], ?C[?I]#≡#?I has a normal form while ?C[?J]#≡#?J does not,
so not ?I#$n#?J; but ?I#$=#?J by theorem 11#c).

Essentially, the property of having a normal form  is too strong for the converse
of (*) to hold.  However, if we consider the weaker property of having a
%2head%1 normal form, then the implications can be established in both
directions. Or, equivalently, since the property of having a h.n.f. 
is equivalent to that of being solvable (proof of lemma 3, {yon(R19)},
 required only h.n.f.),
we shall show that:
.begin center;

?M $= ?N iff ?M $s ?N
.end

From left to right, this is an easy consequence of terms being unsolvable
just when they have value ⊗b in $D.  In the other direction the full proof
is rather intricate but the result can be derived quite straightforwardly
with the aid of a lemma we shall state here without proof (lemma 21, below).

First, however, it is useful to introduce a weak version of the concept of
"separability". We define the partial order relation ?M#$!#?N of ?M being
%3semi-separable from%1 ?N by:
.begin center;

?M $! ?N iff ∃ ?C[ ], ∪C[M] $r∪ I, ∪C[N]∩ is unsolvable, 
.end
and we say that two terms are %3semi-separable%1 if either is semi-separable
from the other.

.group
%7LEMMA 19%1 If ?S is solvable, there is a context ∪C[ ]∩ such that
∪C[S]#$r∪#I∩, while uniformly for all unsolvable terms ?U, ∪C[U]∩ is
unsolvable. Thus, in particular:
.begin center

?S solvable, ?U unsolvable => ?S $! ?U.
.end
Proof:
.begin indent 10,10;
(for closed terms ?S) When ?S is solvable, it has a head normal form, say
.begin center;
∪S ≡ λ?x1N∪.?xi?X1K
.end
Take ∪C[ ] ≡ [ ]?M1?M2 ... ?Mn, where ?Mi ≡ ∪λ?y1?y2...?yk∪.I∩ and ?Mj ≡ ?I for
j#≠#i.
.end
.apart
.group
%7Corollary 19.1%1 Terms are semi-separable iff they are not solvably
equivalent; further
.begin center
?M $! ?N iff  ?M $≤ ?N
.end
The proof is immediate from the definitions and Lemma 19.
.apart
.group
%7Lemma 20%1 If ?U is unsolvable, then ∪I#=#U∩ is inconsistent with the
equality of all unsolvable terms.
Proof:
.begin tabit2(10,15);
\∪I = U∩ \=> ∪X = IX = UX∩ for all terms ?X
\\=> every term = an unsolvable term; but, when all unsolvable terms are equal this
\\=> all terms are equal (inconsistency)
.end
.apart
.group
%7Corollary 20.1%1 The equality of a solvable term and an unsolvable term
is inconsistent with the equality of all unsolvable terms. We will use $* 
to denote the extended theory of the ?λ-calculus with the additional inference
rule, %7Uns%1:
.begin tabit2(15,23);

\\?M, ?N both unsolvable
\%7Uns%1\--------------------
\\    ?M = ?N
.end
Thus:
?S solvable, ?U unsolvable => $* + ?S=?U inconsistent,
since $* is consistent, this implies  not##$*#$ε#?S=?U. 

Proof:
.begin indent 10,10;
Let ∪C[ ]∩ be the context given by Lemma 19. Then:
.begin tabit3(20,30,40);
\$* + ?S=?U $ε ∪I\=#C[S]∩\since ∪C[S] $r ?I
\       \?I=#∪C[U]∩\ from ?S = ?U by substitutivity
\       \?I=#?U'\where ?U' is unsolvable by choice of ∪C[ ]∩.
.end
which implies inconsistency by Lemma 20.
.end
.apart

.group
%7Lemma 21%1 Suppose ?A is in approximate normal form, ?N is an arbitrary
term such that  ?A#$¬#?N. Then ?A is not solvably extended by ?N, whence also ?A
is semi-separable from ?N. (proof omitted.)
.apart
Now we establish our characterization of $D. Actually, it is more convenient
to state and prove this for the inequality $≠, along with the stronger
characterization of ( $~ )⊗↓The latter written in parentheses for the
first three conditions of Theorem 22.←.

.group;
%7THEOREM 22%1 The following four conditions are equivalent:
.begin nofill;indent 5;
1) ?M $≠ ?N     (?M $~ ?N)
2) ?M and ?N are semi-separable   (?M $! ?N)
3) ?M and ?N are not solvably equivalent   ( ?M $≤ ?N)
4) $* + ?M=?N is inconsistent
.end
.apart
Proof:
.begin indent 10,10;
Since 2) and 3) are equivalent by corollary 19.1, it suffices to show that
1)#=>#2)#=>#4)#=>#1.
.begin indent 10,15;
.group
1)=>2) Since every term is the limit of its approximate normal forms,  ?M#$~?N
holds iff ?A#$~?N for some ?A#ε#⊗A(?M). Hence:
.begin tabit2(17,23);
\?M$~?N\=> ∃∪C[ ], C[A] $r∪ I, C[N]∩ unsolvable by Lemma 21.
\\=> ∃∪C[ ], C[M] $r∪ I, C[N]∩ unsolvable by Thm 18a) since ?A ε ⊗A(?M)
\\=> ?M $! ?N by definition of $!.
.end
.apart
.group

2)=>4) Suppose ?M $! ?N and let ∪C[ ]∩ be a context which semi-separates them; i.e.:
.begin center;
∪C[M] $r ∪I, C[N] = U'∩ is unsolvable.
.end
#####Then, as for the proof of corollary 20.1:
.begin center;

$* + ?M=?N $ε ∪I = C[M] = C[N] = U'∩ 
.end
#####from which the result follows by lemma 20.
.apart
.group

4)=>1) Assume $* + ?M = ?N is inconsistent. Since $D models $*, when ?M#$=#?N, 
$D also provides
a model of $*#+#?M=?N, which is therefore consistent, as the  $D model is
consistent, thus ?M#$≠#?N.
.end
.end
.group
For closed terms, the characterization of  $~ and similarly $≠, specializes
to:

%7THEOREM 23%1 For closed terms ?M and ?N, the following three conditions  are 
equivalent:
.begin nofill;
	a) ?M $~ ?N

	b) there exist closed terms ?X1k (k≥0) such that:
	      ?M?X1K $r ?I
	      ?N?X1K is unsolvable

	c) there is a closed term ?F such that
	      ∪FM $r ∪I∩
	      ∪FN∩ is unsolvable
.end
.apart
.group
Proof:
.begin indent 10,10;
That c)=>a) follows from theorem 22, by taking ∪C[#]#≡#F[#]∩ as the semi-separating
context.

That a)=>b) is a consequence of the construction of a semi-separating context for
closed terms in the proof of lemma 21 (which we, alas, omitted)

For b)=>c), simply set ∪F ≡ λz.z?X1K
.end
.apart

Some insight into the content of our characterization is provided by recalling
the intuitive interpretation of "$~" in Scott's theory, in terms of
"information content". For domain elements πa and πb, "πa$~πb" was 
intended to mean "πa gives some information where πb gives either no information
or different information". Then theorem 22 shows
that the sense of information embodied in the relation π≤ between 
terms induced by Scott's models is a good one, since it tells us that when there is a
difference in the "information content" of (the values of) terms, this can be detected
in their computational behavior --when ?M#$~#?N, there are two terms
∪A#≡#C[M]∩ and ∪B#≡#C[N]∩ differing only in one occurrence of ?M or ?N
such that the computation (reduction) for ?A halts and yields the identity
as output, while the computation for ?B not only fails to halt (no normal form),
it also never outputs any information as it proceeds (no head normal form).
[Roughly speaking, when  ?M$~?N, such a semi-separating context ∪C[#]∩ searches
the term for the information contained in ?M but not in ?N. If and when this is
found, for ?M, the appropriate subpart of ?M is extracted and
"normalized" to give the identity as result.
For ?N, the semi-separating context finds either "no information"
(an unsolvable term) or "different information" (a head normal form
dissimilar from that in this sub-part of ?M) in the corresponding part
of ?N; in either case the "normalizing" stage for ?M can be chosen
so that, when applied to this part of ?N, the resulting term fails to
have a head normal form.]

It is this aspect of our characterization of $≠ which has an obvious counterpart 
for programming languages. Suppose $1 and $2 are two well-formed "phrases"
(well-formed program fragments)
of a language %6L%1, and let us say that $1 and $2 are %3computationally
semi-discrete%1 if there are two programs
⊗P1 and ⊗P2 of language %6L%1, differing only in that ⊗P1 contains an occurrence
of $1 where ⊗P2 contains an occurrence of $2, such that the computation for ⊗P1
halts (say with output 0) while that for ⊗P2 continues indefinitely, giving
no output. Then, given a semantics for %6L%1, we can conjecture that if $1 and $2 are
semantically distinct (i.e., have different meanings), then they
are computationally semi-discrete. If this holds
(and the converse) we can reasonably claim that the given semantics for %6L%1
has the appropriate properties for reasoning about the 
computational behavior of programs of %6L%1.

The characterization of $≠ in terms of the inconsistency of extensions of $* 
(part 4, theorem 22) also has some interesting consequences.

.group;
%7Corollary 22.1%1 There is no consistent extension of the relation $=
as an equivalence relation between terms; indeed, $= provides the
unique, maximal, consistent model of the theory $*.
.apart
.group
Proof:
.begin indent 10,10;
Let ⊗= be any equivalence relation between terms which provides a consistent model
of $*. Then:
.tabit3(15,23,40);
\?M⊗=?N\=> $* + ?M=?N consistent\since ⊗= models it
\\=>?M$=?N\by Theorem 22.
.end
.apart
.fill

We can read this corollary as saying: given that we wish to consider all unsolvable
terms equivalent (which we do from earlier discussion), terms have different
values in $D iff this is necessary for consistency. Or, equivalently, terms
are equal in $D iff there is no reason to distinguish between them (where, of course,
we regard terms being computationally semi-discrete as a necessary and sufficient reason for distinguishing
between them).

We have shown that for the ?λ-calculus we have incompleteness in that everything
which is provable is true in the models, but there exist true statements (equations
between terms) in the models which are not provable.  We have completeness with
respect to truth in the models and computation, for solvable terms, 
that is, two solvable terms are equal in the
models iff they have similar head normal forms
in every context.  This does not give us a decision
procedure for truth.  Similarly to the result we had in Predicate Calculus (completeness
but no decidability, although in that case we had completeness among all three
ideas) we know that %2if∩ we have two solvable terms which are in fact equal then
we can compute similar h.n.f.'s for them, and %2if∩ we can compute similar h.n.f.'s
for two terms then they are in fact equal in the models; but again we have no
way of knowing whether our head reduction will terminate until it does.
.ss(Another look at I and J)
Finally, we consider again the terms ?I and ?J 
.begin center;
where ∪I ≡ λx.x∩ and ∪J ≡ YF4λ∪F ≡ (λf.(λx.f(xx))(λx.f(xx)))(λf.λx.λy.x(fy))∩
.end
which are equal in Scott's models although ?I has a normal form while ?J does
not.  At first this result was surprising because it was felt there was a close
connection between the properties of terms having a normal form and termination
of programs.  However non-termination is more satisfactorily captured in the λ-calculus
by "unsolvability" than by the property of terms failing to have a normal
form.  But this alone cannot be said to imply that we should necessarily expect
or desire that, for example, ∪I=J∩ holds in a "reasonable" computational model;
to rationalize this requires further argument.  Of course, the general 
consequences of our characterization of $= above is one reason, but there are
others which are more intuitive.

1.  ∪I=J∩ is a ?λ-calculus analogue of 3=2.999... . Under a suitable formalization
of "infinite normal form" suggested by the study of approximate normal forms,
the term ?J will have an infinite normal form, roughly
.begin center;
∪J=λ?x0∪.λ?x1∪.?x0∪(λ?x2∪.(?x1∪(λ?x3∪.?x2∪(λ?x4∪.?x3∪(...)))))∩
.end
while ∪I≡λx.x∩ is a finite normal form.  Thus, in saying that ∪I=J∩ is a kind of
analogue of 3=2.999... , we mean that it is an example of a finite normal form
which is equivalent to an infinite one, just as 3=2.999... is an example of a
real number which has both a finite and an infinite decimal representation.

2.  ?J represents an infinite number of applications of the ?h-expansion 
operation.  The operation ⊗H of a single ?h-expansion can be expressed by the
equation
.begin center;
⊗H∪(x) = λy.xy  ,  y≥x∩
.end
and of course ⊗H will be an identity operation in any extensional model of
the ?λ-calculus.  Suppose now that instead of performing just one ?h-expansion,
as above, by means of ⊗H, we consider an infinite ?h-expansion operator
⊗H%4∞∩ obtained by recusively applying ?h-expansion to the newly introduced
variable ?y, i.e., the recursive operation
.begin center;
⊗H%4∞∪(x) = λy.x(⊗H%4∞∪(y)) , y≥x∩.
.end
But this equation is just that which is satisfied by the term ?J, so ?J
represents an infinite number of applications of an identity operation.

3.  ?I and ?J are "almost interconvertible" or "interconvertible in the limit".
This is an example of a more general phenomenon which also gives a characterization
of the relation $= (and which is essentially the idea underlying the proof
of the unproved Lemma 21 above).  In the case of ?I and ?J we can illustrate
this by the conversions of ?I and ?J in Figure 2.

.group
.begin tabit3(4,15,45);nofill;

Agree to\\
depth\##########?I\##########?J
________________________________________________________________________________________

\0\∪λx.x\Y%4λ∪F    F≡λf.λx.λy.x(fy)∩


\1\∪λx.λy.x(y)\λx.λy.x(Jy)∩⊗↓Recall that ∪Y%4λ∪F=FY%4λ∪F∩, for any ∪F∩, thus
∪J#=#FJ#=#(λf.λx.λy.x(fy))J∩#β-$r#∪λx.λy.x(Jy)∩; and thus, ∪Jy#β-$r#∪λz.y(Jz)∩.←


\2\∪λx.λy.x(λz.y(z))\λx.λy.x(λz.y(Jz))∩


\3\∪λx.λy.x(λz.y(λw.z(w)))\λx.λy.x(λz.y(λw.z(Jw)))

\...\...\...


.begin center;
%2Figure 2∩

.end
.end
.apart

We have applied suitably chosen ?h-expansions to ?I and ?α?β-reductions to ?J, from
which we can see that, as these conversions proceed, the parts where the terms 
differ is pushed deeper and deeper inside the terms.  In other words, although ?I
and ?J are not finitely interconvertible, they can be transformed, by means
of the conversion rules alone, to terms which match each other syntactically up
to arbitrarily large, finite depth.

Since infinite objects are reasonably called "computable" if they can be obtained
as effective limits of better and better finite approximations, these three
arguments above lead to the conclusion that the equality of ?I and ?J, and
more generally the characterization of $= in Theorem 22, must be regarded as
natural consequences whenever one accepts that the rule of ?h-conversion
is valid.